Analyzing the Eurovision with Snowpark

Python is the language of choice for many Data Scientists. It has a rich set of packages for different Data Science tasks. Therefore, many Data Scientists were excited when Snowflake announced integrations with Python in their database.

To faciliate integrations to other programming languages Snowflake has created Snowpark. With Snowpark you can create user defined functions and stored procedures to run Python scripts that, for example, train a Machine Learning model and writes the result to the database, without having to move the data out of the database. There are tons of tutorials on how to do this. Today I’d like to focus on another aspect of what we as Data Scientists do. As a Data Scientist I do not spend most of my time building ML-models. Most of my work is centered around doing exploratory analysis, digging into unseen data, making visualizations and sharing these with stakeholders.

In this blogpost I’d like to share with you how we can use the Snowpark DataFrame API together with Quarto to create a dynamic presentation that analyzes the Eurovision song contest, without having to open Microsoft Powerpoint or Google Slides.

On notebooks

Many Data Scientists uses Jupyter Notebook to do data analysis. In a Jupter Notebook you can combine code with plots and text to comment your findings. However, when you want to bring a notebook into the hands of a stakeholder, you rarely want to send a raw notebook. In my case I usually want to put together a report or a presentation. There are ways to do this with Jupyter, but it always feels a bit clumsy.

Enter Quarto

Quarto is an open-source scientific and technical publishing system built on Pandoc. It is very similar to a Jupyter Notebook (and is actually powered by Jupyter) but the output is, in my view, prettier. Because it is based on Pandoc you can render your code into a report, a presentation or even a website. If you are familiar to Rmarkdown Quarto basically works in the same way but is language agnostic.

Development setup for Quarto

For my Python work I mainly use VSCode. One good thing about Quarto is that it has an excellent VSCode plugin. You can follow the instructions here.

Creating a virtual environment for Snowpark

I use conda to create virtual environments for Python. In this case I have a conda.yml where I list package dependencies. The yaml-file comes from a Snowflake tutorial.

conda env create -f conda_env.yml
conda activate pysnowpark

To be able to use the environment in Quarto I need to register it to the jupyter kernel.

python -m ipykernel install --user --name=pysnowpark

Set up the Quarto document

Via the command palette in VSCode you can easily create a new .qmd document (or .ipynb if you want the Jupyter notebook experience)

I’m using .qmd. In the header I can now specify my virtual environment.

---
title: "Reporting with Python and Snowflake in Quarto"
format: html
jupyter: pysnowpark
---

Now you can just add python code chunks to your document. I will be using plotly to create interactive plots in our report.

import plotly.express as px
fig = px.scatter(x = [0, 1, 2, 3, 4], y = [0, 1, 4, 9, 16])
fig.show()
    <script type="text/javascript">
    window.PlotlyConfig = {MathJaxConfig: 'local'};
    if (window.MathJax && window.MathJax.Hub && window.MathJax.Hub.Config) {window.MathJax.Hub.Config({SVG: {font: "STIX-Web"}});}
    if (typeof require !== 'undefined') {
    require.undef("plotly");
    requirejs.config({
        paths: {
            'plotly': ['https://cdn.plot.ly/plotly-2.12.1.min']
        }
    });
    require(['plotly'], function(Plotly) {
        window._Plotly = Plotly;
    });
    }
    </script>
    
                        <div id="df86bfd6-7355-4ab4-8845-e9585ae07cac" class="plotly-graph-div" style="height:525px; width:100%;"></div>            <script type="text/javascript">                require(["plotly"], function(Plotly) {                    window.PLOTLYENV=window.PLOTLYENV || {};                                    if (document.getElementById("df86bfd6-7355-4ab4-8845-e9585ae07cac")) {                    Plotly.newPlot(                        "df86bfd6-7355-4ab4-8845-e9585ae07cac",                        [{"hovertemplate":"x=%{x}<br>y=%{y}<extra></extra>","legendgroup":"","marker":{"color":"#636efa","symbol":"circle"},"mode":"markers","name":"","orientation":"v","showlegend":false,"x":[0,1,2,3,4],"xaxis":"x","y":[0,1,4,9,16],"yaxis":"y","type":"scatter"}],                        {"template":{"data":{"histogram2dcontour":[{"type":"histogram2dcontour","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"choropleth":[{"type":"choropleth","colorbar":{"outlinewidth":0,"ticks":""}}],"histogram2d":[{"type":"histogram2d","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"heatmap":[{"type":"heatmap","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"heatmapgl":[{"type":"heatmapgl","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"contourcarpet":[{"type":"contourcarpet","colorbar":{"outlinewidth":0,"ticks":""}}],"contour":[{"type":"contour","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"surface":[{"type":"surface","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"mesh3d":[{"type":"mesh3d","colorbar":{"outlinewidth":0,"ticks":""}}],"scatter":[{"fillpattern":{"fillmode":"overlay","size":10,"solidity":0.2},"type":"scatter"}],"parcoords":[{"type":"parcoords","line":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatterpolargl":[{"type":"scatterpolargl","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"bar":[{"error_x":{"color":"#2a3f5f"},"error_y":{"color":"#2a3f5f"},"marker":{"line":{"color":"#E5ECF6","width":0.5},"pattern":{"fillmode":"overlay","size":10,"solidity":0.2}},"type":"bar"}],"scattergeo":[{"type":"scattergeo","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatterpolar":[{"type":"scatterpolar","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"histogram":[{"marker":{"pattern":{"fillmode":"overlay","size":10,"solidity":0.2}},"type":"histogram"}],"scattergl":[{"type":"scattergl","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatter3d":[{"type":"scatter3d","line":{"colorbar":{"outlinewidth":0,"ticks":""}},"marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scattermapbox":[{"type":"scattermapbox","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatterternary":[{"type":"scatterternary","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scattercarpet":[{"type":"scattercarpet","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"carpet":[{"aaxis":{"endlinecolor":"#2a3f5f","gridcolor":"white","linecolor":"white","minorgridcolor":"white","startlinecolor":"#2a3f5f"},"baxis":{"endlinecolor":"#2a3f5f","gridcolor":"white","linecolor":"white","minorgridcolor":"white","startlinecolor":"#2a3f5f"},"type":"carpet"}],"table":[{"cells":{"fill":{"color":"#EBF0F8"},"line":{"color":"white"}},"header":{"fill":{"color":"#C8D4E3"},"line":{"color":"white"}},"type":"table"}],"barpolar":[{"marker":{"line":{"color":"#E5ECF6","width":0.5},"pattern":{"fillmode":"overlay","size":10,"solidity":0.2}},"type":"barpolar"}],"pie":[{"automargin":true,"type":"pie"}]},"layout":{"autotypenumbers":"strict","colorway":["#636efa","#EF553B","#00cc96","#ab63fa","#FFA15A","#19d3f3","#FF6692","#B6E880","#FF97FF","#FECB52"],"font":{"color":"#2a3f5f"},"hovermode":"closest","hoverlabel":{"align":"left"},"paper_bgcolor":"white","plot_bgcolor":"#E5ECF6","polar":{"bgcolor":"#E5ECF6","angularaxis":{"gridcolor":"white","linecolor":"white","ticks":""},"radialaxis":{"gridcolor":"white","linecolor":"white","ticks":""}},"ternary":{"bgcolor":"#E5ECF6","aaxis":{"gridcolor":"white","linecolor":"white","ticks":""},"baxis":{"gridcolor":"white","linecolor":"white","ticks":""},"caxis":{"gridcolor":"white","linecolor":"white","ticks":""}},"coloraxis":{"colorbar":{"outlinewidth":0,"ticks":""}},"colorscale":{"sequential":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]],"sequentialminus":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]],"diverging":[[0,"#8e0152"],[0.1,"#c51b7d"],[0.2,"#de77ae"],[0.3,"#f1b6da"],[0.4,"#fde0ef"],[0.5,"#f7f7f7"],[0.6,"#e6f5d0"],[0.7,"#b8e186"],[0.8,"#7fbc41"],[0.9,"#4d9221"],[1,"#276419"]]},"xaxis":{"gridcolor":"white","linecolor":"white","ticks":"","title":{"standoff":15},"zerolinecolor":"white","automargin":true,"zerolinewidth":2},"yaxis":{"gridcolor":"white","linecolor":"white","ticks":"","title":{"standoff":15},"zerolinecolor":"white","automargin":true,"zerolinewidth":2},"scene":{"xaxis":{"backgroundcolor":"#E5ECF6","gridcolor":"white","linecolor":"white","showbackground":true,"ticks":"","zerolinecolor":"white","gridwidth":2},"yaxis":{"backgroundcolor":"#E5ECF6","gridcolor":"white","linecolor":"white","showbackground":true,"ticks":"","zerolinecolor":"white","gridwidth":2},"zaxis":{"backgroundcolor":"#E5ECF6","gridcolor":"white","linecolor":"white","showbackground":true,"ticks":"","zerolinecolor":"white","gridwidth":2}},"shapedefaults":{"line":{"color":"#2a3f5f"}},"annotationdefaults":{"arrowcolor":"#2a3f5f","arrowhead":0,"arrowwidth":1},"geo":{"bgcolor":"white","landcolor":"#E5ECF6","subunitcolor":"white","showland":true,"showlakes":true,"lakecolor":"white"},"title":{"x":0.05},"mapbox":{"style":"light"}}},"xaxis":{"anchor":"y","domain":[0.0,1.0],"title":{"text":"x"}},"yaxis":{"anchor":"x","domain":[0.0,1.0],"title":{"text":"y"}},"legend":{"tracegroupgap":0},"margin":{"t":60}},                        {"responsive": true}                    ).then(function(){
                        

var gd = document.getElementById(‘df86bfd6-7355-4ab4-8845-e9585ae07cac’); var x = new MutationObserver(function (mutations, observer) {{ var display = window.getComputedStyle(gd).display; if (!display || display === ‘none’) {{ console.log([gd, ‘removed!’]); Plotly.purge(gd); observer.disconnect(); }} }});

// Listen for the removal of the full notebook cells var notebookContainer = gd.closest(‘#notebook-container’); if (notebookContainer) {{ x.observe(notebookContainer, {childList: true}); }}

// Listen for the clearing of the current output cell var outputEl = gd.closest(‘.output’); if (outputEl) {{ x.observe(outputEl, {childList: true}); }}

                    })                };                });            </script>        </div>

Connect to Snowflake

I have created a free 30 day trial for Snowflake. To be able to use Snowpark in Snowflake there are some more configurations you have to set up. You can find those here.

from snowflake.snowpark.session import Session
from keyring import get_password
import pandas as pd

service_id = 'snowflake-trial'

connection_parameters = {
    "account": get_password(service_id, 'accountname'),
    "user": get_password(service_id, 'username'),
    "password": get_password(service_id, 'password'),
    "role": "ACCOUNTADMIN",
    "database": "HOL_DB",
    "schema": "PUBLIC",
    "warehouse": "HOL_WH"
}

session = Session.builder.configs(connection_parameters).create()

Load some data to Snowflake

For our report we have some data on the Eurovision Song Contest.

We can write this data to Snowflake with Snowpark.

contestants_df = pd.read_csv('data/contestants.csv')
votes_df = pd.read_csv('data/votes.csv')

contestants_df.columns = map(str.upper, contestants_df.columns)
votes_df.columns = map(str.upper, votes_df.columns) # To make our dataset work with the Snowpark API we need to make the columns names upper case.

session.write_pandas(contestants_df, table_name = 'CONTESTANTS', auto_create_table = True, overwrite = True)
session.write_pandas(votes_df, table_name = 'VOTES', auto_create_table = True, overwrite = True)

Using Snowpark DataFrame API and Plotly together

Next, we can create a snowpark dataframe from the database table.

contestant_sdf = session.table('contestants')

This dataframe we can query using the Snowpark DataFrame API. It is like a combination between Pandas and SQL. To display the result or use it for anything else in a Python environment you can use to_pandas().

contestant_sdf.limit(6).to_pandas()
YEAR TO_COUNTRY_ID TO_COUNTRY PERFORMER SONG PLACE_CONTEST SF_NUM RUNNING_FINAL RUNNING_SF PLACE_FINAL PLACE_SF POINTS_SF POINTS_TELE_FINAL POINTS_JURY_FINAL POINTS_TELE_SF POINTS_JURY_SF COMPOSERS LYRICISTS LYRICS YOUTUBE_URL
0 1956 ch Switzerland Lys Assia Refrain 2.0 NaN 2.0 NaN 2.0 NaN NaN NaN NaN NaN NaN Georg Benz Stahl None (Refrain d’amour…), couleur du ci… https://youtube.com/watch?v=IyqIPvOkiRk
1 1956 nl Netherlands Jetty Paerl De Vogels Van Holland 2.0 NaN 1.0 NaN 2.0 NaN NaN NaN NaN NaN NaN Cor Lemaire Annie M. G. Schmidt De vogels van Holland zijn zo muzikaaller… https://youtube.com/watch?v=u45UQVGRVPA
2 1956 be Belgium Fud Leclerc Messieurs Les Noyés De La Seine 2.0 NaN 3.0 NaN 2.0 NaN NaN NaN NaN NaN NaN Jacques Say;Jean Miret Robert Montal Messieurs les noyés de la Seine-moi le… https://youtube.com/watch?v=U9O3sqlyra0
3 1956 de Germany Walter Andreas Schwarz Im Wartesaal Zum Großen Glück 2.0 NaN 4.0 NaN 2.0 NaN NaN NaN NaN NaN NaN Walter Andreas Schwarz None Es gibt einen Hafen, da fährt kaum ein Schiff... https://youtube.com/watch?v=BDNARIDnmTc
4 1956 fr France Mathé Altéry Le Temps Perdu 2.0 NaN 5.0 NaN 2.0 NaN NaN NaN NaN NaN NaN André Lodge Rachèle Thoreau Chante, carillonchant du temps perdu… https://youtube.com/watch?v=dm1L0XyikKI
5 1956 lu Luxembourg Michèle Arnaud Ne Crois Pas 2.0 NaN 13.0 NaN 2.0 NaN NaN NaN NaN NaN NaN Christian Guitreau None Si on te dit qu’t’a une belle gueulea pas … https://youtube.com/watch?v=Pv7GJkqtNuc

6 rows × 21 columns

Okey, so let’s build a simple plot that shows the number of wins per country in descending order. First, we can use Snowpark to compute the number of wins per country. By doing this we are writing Python code but the actual data aggregation is executed within Snowflake.

winners = (contestant_sdf
                .filter("place_final = 1")
                .group_by('to_country')
                .count()
                .to_pandas()
                .sort_values(by = 'COUNT', ascending = False)
                .head(17)
)

Next we visualize it with Plotly.

import plotly.express as px

fig = px.bar(winners, x = 'COUNT', y = 'TO_COUNTRY',
                labels = {
                     "COUNT": "Number of 1st place",
                     "TO_COUNTRY": "Country"
                 },
                title = "Number of Eurovision Song Contest wins since 1956")
fig.update_yaxes(autorange="reversed")

fig.show()
                        <div id="daf355ba-7002-416b-9d2f-fa66303abeed" class="plotly-graph-div" style="height:525px; width:100%;"></div>            <script type="text/javascript">                require(["plotly"], function(Plotly) {                    window.PLOTLYENV=window.PLOTLYENV || {};                                    if (document.getElementById("daf355ba-7002-416b-9d2f-fa66303abeed")) {                    Plotly.newPlot(                        "daf355ba-7002-416b-9d2f-fa66303abeed",                        [{"alignmentgroup":"True","hovertemplate":"Number of 1st place=%{x}<br>Country=%{y}<extra></extra>","legendgroup":"","marker":{"color":"#636efa","pattern":{"shape":""}},"name":"","offsetgroup":"","orientation":"h","showlegend":false,"textposition":"auto","x":[7,6,5,5,5,5,4,3,3,2,2,2,2,2,1,1,1],"xaxis":"x","y":["Ireland","Sweden","Netherlands","Luxembourg","France","United Kingdom","Israel","Norway","Denmark","Germany","Spain","Austria","Italy","Ukraine","Finland","Portugal","Russia"],"yaxis":"y","type":"bar"}],                        {"template":{"data":{"histogram2dcontour":[{"type":"histogram2dcontour","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"choropleth":[{"type":"choropleth","colorbar":{"outlinewidth":0,"ticks":""}}],"histogram2d":[{"type":"histogram2d","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"heatmap":[{"type":"heatmap","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"heatmapgl":[{"type":"heatmapgl","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"contourcarpet":[{"type":"contourcarpet","colorbar":{"outlinewidth":0,"ticks":""}}],"contour":[{"type":"contour","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"surface":[{"type":"surface","colorbar":{"outlinewidth":0,"ticks":""},"colorscale":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]]}],"mesh3d":[{"type":"mesh3d","colorbar":{"outlinewidth":0,"ticks":""}}],"scatter":[{"fillpattern":{"fillmode":"overlay","size":10,"solidity":0.2},"type":"scatter"}],"parcoords":[{"type":"parcoords","line":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatterpolargl":[{"type":"scatterpolargl","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"bar":[{"error_x":{"color":"#2a3f5f"},"error_y":{"color":"#2a3f5f"},"marker":{"line":{"color":"#E5ECF6","width":0.5},"pattern":{"fillmode":"overlay","size":10,"solidity":0.2}},"type":"bar"}],"scattergeo":[{"type":"scattergeo","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatterpolar":[{"type":"scatterpolar","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"histogram":[{"marker":{"pattern":{"fillmode":"overlay","size":10,"solidity":0.2}},"type":"histogram"}],"scattergl":[{"type":"scattergl","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatter3d":[{"type":"scatter3d","line":{"colorbar":{"outlinewidth":0,"ticks":""}},"marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scattermapbox":[{"type":"scattermapbox","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scatterternary":[{"type":"scatterternary","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"scattercarpet":[{"type":"scattercarpet","marker":{"colorbar":{"outlinewidth":0,"ticks":""}}}],"carpet":[{"aaxis":{"endlinecolor":"#2a3f5f","gridcolor":"white","linecolor":"white","minorgridcolor":"white","startlinecolor":"#2a3f5f"},"baxis":{"endlinecolor":"#2a3f5f","gridcolor":"white","linecolor":"white","minorgridcolor":"white","startlinecolor":"#2a3f5f"},"type":"carpet"}],"table":[{"cells":{"fill":{"color":"#EBF0F8"},"line":{"color":"white"}},"header":{"fill":{"color":"#C8D4E3"},"line":{"color":"white"}},"type":"table"}],"barpolar":[{"marker":{"line":{"color":"#E5ECF6","width":0.5},"pattern":{"fillmode":"overlay","size":10,"solidity":0.2}},"type":"barpolar"}],"pie":[{"automargin":true,"type":"pie"}]},"layout":{"autotypenumbers":"strict","colorway":["#636efa","#EF553B","#00cc96","#ab63fa","#FFA15A","#19d3f3","#FF6692","#B6E880","#FF97FF","#FECB52"],"font":{"color":"#2a3f5f"},"hovermode":"closest","hoverlabel":{"align":"left"},"paper_bgcolor":"white","plot_bgcolor":"#E5ECF6","polar":{"bgcolor":"#E5ECF6","angularaxis":{"gridcolor":"white","linecolor":"white","ticks":""},"radialaxis":{"gridcolor":"white","linecolor":"white","ticks":""}},"ternary":{"bgcolor":"#E5ECF6","aaxis":{"gridcolor":"white","linecolor":"white","ticks":""},"baxis":{"gridcolor":"white","linecolor":"white","ticks":""},"caxis":{"gridcolor":"white","linecolor":"white","ticks":""}},"coloraxis":{"colorbar":{"outlinewidth":0,"ticks":""}},"colorscale":{"sequential":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]],"sequentialminus":[[0.0,"#0d0887"],[0.1111111111111111,"#46039f"],[0.2222222222222222,"#7201a8"],[0.3333333333333333,"#9c179e"],[0.4444444444444444,"#bd3786"],[0.5555555555555556,"#d8576b"],[0.6666666666666666,"#ed7953"],[0.7777777777777778,"#fb9f3a"],[0.8888888888888888,"#fdca26"],[1.0,"#f0f921"]],"diverging":[[0,"#8e0152"],[0.1,"#c51b7d"],[0.2,"#de77ae"],[0.3,"#f1b6da"],[0.4,"#fde0ef"],[0.5,"#f7f7f7"],[0.6,"#e6f5d0"],[0.7,"#b8e186"],[0.8,"#7fbc41"],[0.9,"#4d9221"],[1,"#276419"]]},"xaxis":{"gridcolor":"white","linecolor":"white","ticks":"","title":{"standoff":15},"zerolinecolor":"white","automargin":true,"zerolinewidth":2},"yaxis":{"gridcolor":"white","linecolor":"white","ticks":"","title":{"standoff":15},"zerolinecolor":"white","automargin":true,"zerolinewidth":2},"scene":{"xaxis":{"backgroundcolor":"#E5ECF6","gridcolor":"white","linecolor":"white","showbackground":true,"ticks":"","zerolinecolor":"white","gridwidth":2},"yaxis":{"backgroundcolor":"#E5ECF6","gridcolor":"white","linecolor":"white","showbackground":true,"ticks":"","zerolinecolor":"white","gridwidth":2},"zaxis":{"backgroundcolor":"#E5ECF6","gridcolor":"white","linecolor":"white","showbackground":true,"ticks":"","zerolinecolor":"white","gridwidth":2}},"shapedefaults":{"line":{"color":"#2a3f5f"}},"annotationdefaults":{"arrowcolor":"#2a3f5f","arrowhead":0,"arrowwidth":1},"geo":{"bgcolor":"white","landcolor":"#E5ECF6","subunitcolor":"white","showland":true,"showlakes":true,"lakecolor":"white"},"title":{"x":0.05},"mapbox":{"style":"light"}}},"xaxis":{"anchor":"y","domain":[0.0,1.0],"title":{"text":"Number of 1st place"}},"yaxis":{"anchor":"x","domain":[0.0,1.0],"title":{"text":"Country"},"autorange":"reversed"},"legend":{"tracegroupgap":0},"title":{"text":"Number of Eurovision Song Contest wins since 1956"},"barmode":"relative"},                        {"responsive": true}                    ).then(function(){
                        

var gd = document.getElementById(‘daf355ba-7002-416b-9d2f-fa66303abeed’); var x = new MutationObserver(function (mutations, observer) {{ var display = window.getComputedStyle(gd).display; if (!display || display === ‘none’) {{ console.log([gd, ‘removed!’]); Plotly.purge(gd); observer.disconnect(); }} }});

// Listen for the removal of the full notebook cells var notebookContainer = gd.closest(‘#notebook-container’); if (notebookContainer) {{ x.observe(notebookContainer, {childList: true}); }}

// Listen for the clearing of the current output cell var outputEl = gd.closest(‘.output’); if (outputEl) {{ x.observe(outputEl, {childList: true}); }}

                    })                };                });            </script>        </div>

Making a Quarto report

Okay, so now we know how run Python on Snowflake tables, make some pretty visualizations. Only thing left is making this into a report to share with stakeholders.

Once we have the database and the virtual environment ready putting together a Quarto report is really is easy. In Quarto we can use parameters in the code chunks to “echo” the code or hide it by setting #|echo: false in a code chunk. Usually the code is not that interesting to look at if you don’t know Python.

This is how the report looks.

But if we want to make it into a presentation that is as easy as changing the format:

---
title: "My Eurovision Slideshow"
jupyter: pysnowpark
format:
  revealjs: 
    theme: moon
---

You can take a look at the code for this report here.

Why do this?

Making beautiful reporting in Python has not been all that easy in the past. With Quarto we can create informative reports, slideshows and websites that gets our Python results closer to stakeholders. And all of this can be done without traditional reporting tools like Powerpoint or Google Slides, making our work more reproducible and less prone to error.

What a like about the Snowpark DataFrame API is that it brings data closer to the Data Scientist. I have seen many analysts and Data Scientists working primarily in the Snowflake online editor, download result of their queries and then visualize it. With Snowpark Data Scientists don’t have to switch between development environments.

This has the potential for making reporting a lot easier for Data Scientist.